Visual Reasoning


Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning

Mar 27, 2025

Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks

Mar 27, 2025

MAVERIX: Multimodal Audio-Visual Evaluation Reasoning IndeX

Mar 27, 2025

RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models

Mar 27, 2025

DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning

Mar 25, 2025

Online Reasoning Video Segmentation with Just-in-Time Digital Twins

Mar 27, 2025

Test-Time Reasoning Through Visual Human Preferences with VLMs and Soft Rewards

Mar 25, 2025

Cross-Modal State-Space Graph Reasoning for Structured Summarization

Mar 26, 2025

VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness

Mar 27, 2025

Fine-Grained Evaluation of Large Vision-Language Models in Autonomous Driving

Mar 27, 2025